Method for reproducing audio documents
with the aid of an interface comprising document groups 
and associated reproducing device 

The invention relates to a method of reproducing audio documents using a
reproduction apparatus, and to a reproduction apparatus furnished with a
graphical user interface allowing selection.

The storage of a large number of sound documents within mass-market
equipment is known. The reproduction apparatus is fitted with an interface
making it possible to easily retrieve the document desired by the user. The
reproduction apparatuses are, for example, personal audio CD players, personal
players containing a hard disk (such as the MP3 Lyra model marketed by the
applicant) capable of storing 300 hours of music, players for the home with
display and remote control, and personal computers with screen, hard disk, CD
player and keyboard. In all cases, the user must enter the specific identifier
of the audio document to be reproduced. In the case of audio CDs, he must
program the number of the CD and the number of the piece within this CD. In
certain cases, the reproduction apparatus is fitted with a player which displays
the identifier of the audio document currently being reproduced. For example,
the Lyra MP3 player has a small LCD screen making it possible to display the
functions selected in the form of icons, and the numbers of the audio pieces.
Home equipment has a hard disk of large capacity, 20 gigabytes for example,
thereby making it possible to store thousands of sound contents. The graphical
interface consists of a large screen making it possible to display more
information, the complete title of the piece for example.

According to the type of interface, the selection of the sound documents
is performed through a number or through an identifier within a list displayed on
a screen. With the growth in storage means, the number of documents stored
becomes larger and the user may therefore spend some time searching for the
one in which he is interested. When information in digital form, referred to as
attributes, is associated with the sound documents, the reproduction apparatus
can create groups. The attributes of the audio documents are for example the
genre (classical music, pop, choral, jazz, etc.), the title, the producer, the
singer, the publisher, etc.

By determining groups possessing a degree of musical unity and by
displaying these groups with the aid of an identifier, the apparatus allows the
user to first select a group and then navigate within it to search for a piece.
The identifier of the group is then the common attribute shared by the
documents.

However, certain audio contents accessible to a user do not
automatically possess these attributes, for example when the user himself
records his musical pieces live.

In this case, another way of classing audio documents is to analyse the
sound signals directly. Signal analysis techniques exist which make it possible
to calculate values of so-called "low-level" parameters for each audio content.
These parameters are for example: the tempo, the energy, the brightness, the
envelope, etc. They are determined by analysing the signal either in its digital
form, or in its analogue form. A technique of audio content indexation is
explained in the article "Speech and Language Technologies for audio indexing
and retrieval" published in August 2000 in the Proceedings of the IEEE,
Volume 88, pages 1338 to 1353. The article explains how, by analysing the audio
signal, it is possible to classify the various contents. Other articles describe
means of calculating low-level parameters and possible uses. The following
articles are incorporated by reference into the present patent application:

■ B. Feiten and S. Gunzel, Automatic indexing of a Sound
Database using self-organizing neural networks, Computer
Music Journal, 18(3), 1994

■ Eric Scheirer, Music Listening systems, PhD thesis, MIT Media
Laboratory, Apr 2000.

Once the low-level parameters have been determined for each sound
document of the collection, the storage or reproduction apparatus can class
them groupwise as a function of these parameters. Thus, the classical music
contents may constitute one group, and the jazz pieces another group.
Patent application PCT/GB01/00681 published on 23 August 2001 describes a
user interface consisting of a graphic displayed on a screen and controlled by
an audiovisual receiver. The menu displayed exhibits icons ("classical", "jazz",
"chart music", "talk back", etc.) selectable by the user, the selection of a
document of the group activating the reproduction of its sound content. The
identifiers of the groups may be introduced by the user as a function of the
documents contained in the group at a given instant. But when new documents
are downloaded, the identification of the groups must be able to evolve so as to
define the group better. Moreover, if many documents are assigned to a group,
it may be beneficial to split it into several groups to obtain sets of documents of
average size. Such an operation compels the user to redefine the identifiers.

Japanese patent JP07-044575 discloses a method of vocal recognition
making it possible to process vocal documents or vocal sources and to place
them in a video. The vocal contents are represented in a space ("sound field
space") by symbols that can be selected with the aid of a mouse. The user
moves within the "sound field space" with the aid of the mouse. The documents
are grouped according to a hierarchical structure. When navigating in the sound
space, the volume of the sound of a document is inversely proportional to the
distance between the user placed in the space and this document. Therefore,
all the sounds associated with the documents of a group are emitted, and this
superposition of sounds does not facilitate navigation and selection within this
sound space.

One of the objects of the present invention is to offer the user an
automatic means of classing the documents into groups and of identifying them
easily. The user can then navigate, in an effective and convenient manner,
from group to group as well as within a group.

The subject of the invention is a method of reproduction within an audio
document reproduction apparatus characterized in that it comprises the
following steps:

- partitioning of the documents into groups of documents possessing at
least one similar audio characteristic,

- determination of at least one audio document representing each group,

- positioning of a plurality of audio documents in a space, the
positioning of an audio document being dependent on at least one characteristic
of the document, the user occupying a position in the said space,

- reproduction of at least one identifier of a document representing a
group, the reproduced identifier or identifiers having a position situated at a
distance less than a determined distance with respect to the position of the user
in the space.

In this way, the apparatus itself determines the groups of audio
documents and at least one document representative of each group, an identifier
of the representative document or documents being emphasized in a graphical
and/or auditory manner for the user. The user can thus take note of the
type of music involved and can select this group and elements of this group so
as to reproduce them. According to a first improvement, the user can activate a
command making it possible to go from one group to another; the identifiers as
well as the documents reproduced are automatically updated as a function of
the current document group. According to another improvement, the user can,
by activating a command, reproduce the documents within the group whose
identifier is reproduced.

According to another improvement, the method comprises a step of
representation of the documents in a space whose number of dimensions is
equal to the number of audio parameters, the documents being associated
with points disposed within this space. In this way, the determination of a
document of the group as representative of this group depends on the distance
between the equibarycentre of the points associated with the documents of the
group and the point associated with this document. The document whose
associated point is closest to the equibarycentre is regarded as representative
of the group.

According to another improvement, the method comprises a step of
projection onto a space of determined dimension of the points associated with
the documents of the set and possessing as coordinates the audio parameters.
In this way, the set of documents can be shown by representing the projection
space graphically. Moreover, the distances between the equibarycentre and each
point associated with a document of a group are simpler to calculate.
According to a variant, the points of the representative documents of a group
are situated at a distance from the equibarycentre lying in a determined
interval. In this way, the group is characterized not by a single document but
by several which, surrounding the equibarycentre, enable the user to take
better note of the genre of the group while appreciating the diversity thereof.

According to another improvement, when the user has selected a group
and reproduces the documents of this group, the order of reproduction
consists in commencing with the document whose point is closest to the
equibarycentre, and thereafter in taking those situated further and further away.

According to another improvement, a document regarded as
representative of a group possesses low-level parameters whose values are
close to the average of the values of the documents of the group.

According to another improvement, if several documents are
representatives of a group, the reproduction of each of the documents is
performed sequentially during a determined period.

According to another improvement, the reproduction apparatus receives
the values of the audio parameters. On the basis of these values, the apparatus
determines the groups and the documents representing these groups.

The subject of the invention is also an audio document reproduction
apparatus comprising a means of command introduction, characterized in that it
furthermore comprises a means of calculation for partitioning documents into
groups of documents possessing at least one similar audio characteristic, a
means of determination of at least one document representing each group, a
means of calculation of positioning data associated with each document in a
space, the data being determined by at least one characteristic specific to the
document, a positioning datum also being assigned to the position of the user
within the space, a means of selection of at least one document representing a
group, the selected document or documents having a position situated at a
distance less than a determined distance with respect to the position of the user
in the space, and a means of reproduction of at least one identifier of at least
one document representing a group.

Other characteristics and advantages of the invention will now become
apparent in greater detail within the framework of the following description
of exemplary embodiments, given by way of illustration and referring to
the appended figures, in which:

- Figure 1 is a block diagram of an exemplary sound document
reproduction apparatus for the implementation of the invention,

- Figure 2 is an array associating with each document of the collection its
values of low-level parameters,

- Figure 3 represents a projection onto a two-dimensional space of the
points associated with documents belonging to three groups,

- Figure 4 describes a screen shot presenting a screen background and
an interface for selecting the various groups of sound documents,

- Figure 5 is a block diagram of an exemplary sound document
reproduction apparatus according to a second exemplary embodiment,

- Figure 6 describes a representation of the sound space in which the
user moves around according to a second exemplary embodiment of the
invention,

- Figure 7 describes a block diagram of the audio interface according to
a second exemplary embodiment of the invention.

We shall firstly describe the manner of operation of a multimedia
receiver 1 associated with a device 2 for display and reproduction of sound. The
receiver comprises a central unit 3 linked to a program memory 12, and an
interface 5 for communication with a high bit rate local digital bus 6 making it
possible to receive audio and/or video data at high bit rate. This network is for
example an IEEE 1394 network. The receiver can also receive audio and/or
video data from a transmission network through a reception antenna associated
with a demodulator 4; this network can be of radio or television type. The
receiver furthermore comprises a receiver of infrared signals 7 for receiving the
signals from a remote control 8, a memory 9 for storing a database, and
audio/video decoding logic 10 for generating the audiovisual signals dispatched
to the television screen 2. The remote control 8 is fitted with direction keys ↑,
↓, -> and <- and with "OK", "Group", "sound documents" and "Select" keys whose
function we shall see later.

The receiver also comprises a circuit 11 for displaying data on the
screen, often called an OSD circuit, standing for "On Screen Display". The OSD
circuit 11 is a text and graphic generator which makes it possible to display
menus, pictograms or other graphics on the screen, as well as menus presenting
the navigation. The OSD circuit is controlled by the Central Unit 3 and a
navigator 12. The navigator 12 is advantageously embodied in the form of a
program module recorded in a read only memory. It may also be embodied in the
form of a specialized circuit, of ASIC type for example.

The digital bus 6 and/or the transmission network transmit audio
contents to the receiver either in digital form, or in analogue form, the receiver
recording them in a memory 9. According to a preferred embodiment, the audio
contents are received in digital form, preferably coded according to a
compression standard, MP3 for example, and stored in the same form.
According to this preferred embodiment, the memory 9 is a large-capacity hard
disk, 40 gigabytes for example. Since the storage of a minute of audio content
in MP3 occupies around 1 megabyte, such a disk is capable of recording about
666 hours of sound documents. The downloading of audio content is a well known
technique which need not be explained in the present patent application.
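
The capacity figure quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming decimal units (1 gigabyte = 1000 megabytes) and the approximate rate of 1 megabyte per minute given in the description; the function name is illustrative only.

```python
def recording_hours(disk_gigabytes, megabytes_per_minute=1.0):
    """Approximate hours of compressed audio that fit on a disk."""
    megabytes = disk_gigabytes * 1000      # assumption: decimal gigabytes
    minutes = megabytes / megabytes_per_minute
    return minutes / 60

# A 40 gigabyte disk at about 1 megabyte per minute.
print(int(recording_hours(40)))
```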

Once a certain number of audio contents have been stored in the
memory 9, the user wants to reproduce them without too many manual
interventions; he also wants the contents to follow one another with a
similitude so as to maintain a harmonious ambiance. To do this, a software
module of the navigator analyses each audio content during its reception and
extracts the low-level parameters therefrom. As indicated in the preamble,
numerous signal analysis techniques exist which make it possible to obtain
arrays of digital descriptors for these songs. The number of elements of a
descriptor is of the order of a few tens.

The array contained in the screen page of Figure 2 presents the values
of low-level parameters constituting the descriptors of a certain number of audio
documents. The first column of the array presents the title of the audio content;
each content is numbered. The subsequent columns present the values of
low-level parameters associated with the document, such as the mean sound
intensity, the tempo, the energy, the zero crossing rate, the brightness, the
envelope, the bandwidth, the loudness, the cepstral coefficients, etc.
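
By way of illustration, two of the low-level parameters listed above can be computed from a block of samples as sketched below. The description does not give formulas; the definitions used here (mean of the squared samples for the energy, fraction of sign changes for the zero crossing rate) are common textbook ones and are assumptions, as are the function names.

```python
def mean_energy(samples):
    """Mean of the squared sample values (assumed definition of energy)."""
    return sum(s * s for s in samples) / len(samples)

def zero_crossing_rate(samples):
    """Fraction of consecutive sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)

def descriptor(samples):
    """A two-element descriptor; a real descriptor holds a few tens of values."""
    return [mean_energy(samples), zero_crossing_rate(samples)]

signal = [0.5, -0.5, 0.5, -0.5, 0.5]   # toy signal alternating in sign
print(descriptor(signal))
```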

According to an improvement, the low-level parameters may be
provided in digital form together with the audio content. When the content is
provided by a means of digital transmission and in compressed form, the
associated low-level parameters constitute a field attached to the audio
content. This solution is particularly advantageous since the calculation of the
parameters is performed by the producer or the provider of the content and not
by the user, and hence it is carried out once only.

Be they downloaded or calculated locally, the descriptors are stored in
the memory 9 and then utilized to create groups of documents possessing
certain similitudes. According to a first approach, the grouping of the contents
into coherent groups (or clusters) may be carried out with the aid of a so-called
"clustering" algorithm, for example the k-means algorithm (MacQueen, "Some
Methods for classification and analysis of multivariate observations", Proc. Fifth
Berkeley Symposium on Math., Stat., and Prob., vol. 1, pp. 281-296, 1967). The
array of descriptors of Figure 2 possesses a new column defining the group in
which the content is situated. Group calculation techniques are well known;
using the k-means algorithm, the number of groups thus produced can easily be
controlled.
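
The first approach can be sketched as follows. This is a minimal, plain k-means loop given for illustration only; the initialisation by random sampling, the fixed number of iterations and all names are assumptions, not details taken from the description. The parameter k is what allows the number of groups to be controlled.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Return (labels, centroids) for a list of equal-length descriptors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each descriptor joins its nearest centroid.
        labels = [
            min(range(k), key=lambda g: math.dist(p, centroids[g]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for g in range(k):
            members = [p for p, lab in zip(points, labels) if lab == g]
            if members:  # an empty group keeps its previous centroid
                centroids[g] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids

descriptors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
labels, _ = kmeans(descriptors, k=2)
print(labels)  # the group column that would be appended to the array
```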

According to a second approach, the groups are determined by a prior 
choice of classes (for example: mood, dominant instruments, tempo, etc.) and a 
ground truth helping to define these classes. 

Once the documents have been classed within the various groups, the
program then determines, for each group, one or more representative
documents, or representatives of the said group.

One way of proceeding consists in positioning identifier points Pi
identifying each document of a group in a multidimensional space and in
determining the document situated nearest the equibarycentre of the set of these
points. The equibarycentre is the centre of gravity of a set of points possessing
the same mass. The positions of the points associated with each document are
obtained on the basis of the low-level parameters; the space containing these
points possesses as many dimensions as the document possesses low-level
parameters.

A projection onto a two-dimensional space can be used to clearly
explain the principle. Figure 3 represents a two-dimensional space where the
points corresponding to three groups of documents, denoted A, B and C, are
disposed. The coordinates (xi, yi) of each point are obtained by projecting the
point Pi onto a space of dimension 2. The projection is determined by principal
component analysis or PCA. PCA is described in particular in the Saporta
document 1990, entitled "Probabilités, Analyse des données et Statistique,
Editions Technip" [Probabilities, data analysis and statistics, published by
Technip]. This well-known data analysis algorithm seeks a subsystem of axes,
linearly related to the original axes, which best "spreads" the samples; these
axes tend to merge the correlated original axes. The low-level descriptors
being assumed to have perceptual coherence (sounds are perceived as close if
and only if the values of their low-level descriptors are close), and the
projection being continuous, the sound documents associated with close
points within the space of dimension 2 resemble one another from the auditory
standpoint. The same example can be applied to a space of dimension 3, using
a projection in such a space.
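
The projection step can be sketched as follows. The description only names PCA; the sketch below substitutes a simple power iteration with deflation, in plain Python, for a library PCA routine, and the function names and start vector are assumptions made for illustration. It is not robust to degenerate data (for example a start vector exactly orthogonal to the leading axis).

```python
import math

def unit(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def mean_center(rows):
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(rows):
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_axis(matrix, steps=200):
    """Power iteration: unit eigenvector of the largest eigenvalue."""
    d = len(matrix)
    v = unit([float(i + 1) for i in range(d)])  # arbitrary start vector
    for _ in range(steps):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        if not any(w):
            break
        v = unit(w)
    return v

def pca_project(rows, dims=2):
    """Project descriptors onto their `dims` principal axes."""
    centered = mean_center(rows)
    cov = covariance(centered)
    d = len(cov)
    axes = []
    for _ in range(dims):
        axis = leading_axis(cov)
        eigenvalue = sum(axis[i] * cov[i][j] * axis[j]
                         for i in range(d) for j in range(d))
        axes.append(axis)
        # Deflation: remove the found component before the next search.
        cov = [[cov[i][j] - eigenvalue * axis[i] * axis[j] for j in range(d)]
               for i in range(d)]
    return [[sum(a * x for a, x in zip(axis, row)) for axis in axes]
            for row in centered]

rows = [[0.0, 0.0, 0.0], [1.0, 1.0, 0.1], [2.0, 2.0, -0.1],
        [3.0, 3.0, 0.1], [4.0, 4.0, -0.1]]
for x, y in pca_project(rows):
    print(round(x, 3), round(y, 3))
```

In this toy data set the first two descriptor columns are perfectly correlated, so the first projected axis merges them, which is the behaviour the description attributes to PCA.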

The calculation of the equibarycentre applied to the three sets leads to
the determination of three points GA, GB and GC, which are situated
approximately at the centre of each contour delimiting the groups A, B, and C
such as shown in Figure 3. According to the present exemplary embodiment,
the document whose point (xi, yi) is closest to the equibarycentre of a group is
regarded as the representative of the group.
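
The determination of the representative can be sketched as below. The dictionary layout (title mapped to coordinates) and the function names are assumptions made for illustration; the Euclidean distance is also an assumption, since the description does not name a metric.

```python
import math

def equibarycentre(points):
    """Centre of gravity of points that all carry the same mass."""
    return [sum(coords) / len(points) for coords in zip(*points)]

def representative(group):
    """Title of the document whose point lies nearest the equibarycentre.

    `group` maps a document title to its point (a list of coordinates).
    """
    centre = equibarycentre(list(group.values()))
    return min(group, key=lambda title: math.dist(group[title], centre))

group_a = {"doc 1": [0.0, 0.0], "doc 2": [4.0, 0.0], "doc 3": [2.0, 1.0]}
print(representative(group_a))
```

The same functions apply unchanged to points of any dimension, so they can be used either before or after a projection.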

The step consisting in projecting the points onto a one-, two- or
three-dimensional space makes it possible to create a graphical representation
of the collection of documents accessible from an apparatus. Moreover, the
calculations of distance between the equibarycentre and each point associated
with a document of a group are simpler, since the number of dimensions of the
projection space is markedly less than the number of low-level parameters.
Depending on its membership of this or that group, the point associated with
the document is of a certain shape (as shown in Figure 3), or of a certain colour,
or has any other distinctive graphical characteristic. A graphical representation
such as this constitutes, together with a keypad, a user interface making it
possible to select any point whatsoever within a group. To do this, the user can
jump from one point to another by indicating a direction of navigation with the
aid of the direction keys.

However, the step of projection onto a one-, two- or three-dimensional
space is optional, since it is perfectly possible to determine the equibarycentre
of a group of points disposed in a multidimensional space; likewise it is possible
to calculate the distance separating any point whatsoever of the group from the
equibarycentre. In this case, it is difficult to represent the documents by points,
and the graphical interface then presents only graphical identifiers of groups.
Such an example of a graphical interface is represented in Figure 4.

Depicted in Figure 4 is a screen background image and a set of
graphical identifiers of groups. A graphical identifier of a group is an icon
containing a number varying from 1 to the number of groups calculated during
the step of determining groups. These identifiers are joined by a graphical link
giving the user an indication of the navigation command to be activated to
change groups. In the example illustrated in Figure 4, group 7 is selected; by
pressing the ↑ direction key, group 6 is selected, and by pressing the ↓
direction key, group 8 is selected. The icon containing the current group (group
7 in Figure 4) is emphasized by a bolder contour, or by highlighting, or by
flashing, or else by a coloured background. If the icons are disposed
horizontally, the user uses the -> and <- direction keys to change groups.

When the user navigates groupwise, the apparatus reproduces the
sound document representing the group. In this way, the user can in an auditory
manner ascertain the genre of sound or of music which is common to the set of
documents of the group. In a variant, a determined number of sound documents
represent the group. According to this variant, these documents are reproduced
in a loop when the group is selected. The representative documents are for
example those situated at a distance less than a determined value from the
equibarycentre. In an improvement of this variant, the user himself determines
the number of each group's representative documents. In this way, the user may
instigate the reproduction of a significant number of documents having auditory
continuity without having to select them manually. The first document selected
by the program as representative is that of the group whose distance is smallest
from the equibarycentre, then the second, then the third and so on and so forth.
When the number programmed by the user is reached, the program selects the
first document again.
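
The looped reproduction of a user-chosen number of representatives can be sketched as follows. The names, the data layout and the use of the Euclidean distance are assumptions made for illustration.

```python
import itertools
import math

def representatives_loop(points, centre, count):
    """Endless iterator over the `count` documents nearest to `centre`.

    `points` maps a document title to its coordinates; documents are
    visited from the nearest outwards, then the sequence starts again.
    """
    ranked = sorted(points, key=lambda t: math.dist(points[t], centre))
    return itertools.cycle(ranked[:count])

points = {"a": [0.0, 1.0], "b": [0.0, 3.0], "c": [0.0, 2.0], "d": [0.0, 9.0]}
loop = representatives_loop(points, centre=[0.0, 0.0], count=3)
print([next(loop) for _ in range(5)])  # ['a', 'c', 'b', 'a', 'c']
```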

Another improvement consists in reproducing only an extract of each
document. The duration of each extract may be defined by the program, or
advantageously, the user programs this duration. In this way, the user can
rapidly get an idea of the genre of sound documents located in the group.

When a group is selected, the user presses the "sound documents" key
to select each document of the group and thus activate its sound reproduction.
He can then go from one document to another by virtue of the -> and <-
direction keys. If the graphical interface so permits, the title of the sound
document is displayed. Advantageously, the titles of the two documents situated
immediately before (selectable by the <- key) and after (selectable by the ->
key) are also displayed. The user can thus ascertain the two documents directly
reproducible on the basis of the current document.
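
Navigation within a group, with display of the neighbouring titles, can be sketched as below. The behaviour at the two ends of the list is not specified in the description; the sketch assumes that the selection simply stays where it is. The function names are illustrative.

```python
def neighbours(titles, current):
    """Return (previous, current, next) titles; None at either end."""
    previous = titles[current - 1] if current > 0 else None
    following = titles[current + 1] if current < len(titles) - 1 else None
    return previous, titles[current], following

def press(key, titles, current):
    """Index of the current document after a '->' or '<-' key press."""
    if key == "->" and current < len(titles) - 1:
        return current + 1
    if key == "<-" and current > 0:
        return current - 1
    return current  # assumption: no wrap-around at the ends

titles = ["Title A", "Title B", "Title C"]
position = press("->", titles, 0)
print(neighbours(titles, position))
```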

In the foregoing, an embodiment applied to an apparatus having a
means of display (2) was described, this means making it possible to
graphically reproduce the identifier of the document representing a group of
documents having sound similitude. According to another embodiment, the
apparatus does not have a refined display means allowing it to display at
least the group identifiers.

Such an apparatus is described by Figure 5, and the manner of
operation of a player 5.1 for reproducing audio documents will firstly be
described. This player is portable and stand-alone. It has a battery 5.2, a
Central Unit 5.3 (UC) linked to a program memory 5.12, a keypad 5.8
allowing the user to introduce all the commands required for the reproduction of
the audio contents, and an audio interface 5.10 comprising at least one D/A
converter, at least one preamplifier whose gain is adjustable by the UC 5.3 and
an amplifier dispatching the amplified sound signals to at least two
loudspeakers 5.11. The keypad 5.8 has four direction keys and a rotary element
making it possible to introduce a leftward or rightward rotational motion,
conventional commands for reproducing a sound document (play, fast forward,
fast rewind, stop, volume adjustment), a rotary selector and at least one
thumbwheel. The loudspeakers 5.11 are connected to the player; they may be
earphones on a headset worn by the user. The audio contents are
advantageously recorded on a hard disk 5.9, but any other recording medium
may suit, in particular removable media (audio CD, DVD, magnetic cartridge,
electronic card, etc.). The audio contents may be downloaded onto the hard disk
5.9 in the same way as that described for Figure 1. The downloading of an
audio content is a well-known technique that it is unnecessary to explain in the
present document.

Once a certain number of audio contents have been stored in the
memory 5.9, the user wishes to select them and reproduce them. To do this, the
program analyses each audio content and extracts therefrom the low-level
parameters. The signal analysis techniques are identical to those indicated
previously for the apparatus described by Figure 1.

According to an example of this second embodiment of the invention,
the sound documents Di accessible from the player are virtually represented by
points Pi disposed in a sound space with n dimensions. For the sake of
simplicity and comprehension, this second exemplary embodiment uses a
sound space with two dimensions. The layout of Figure 6 illustrates such an
arrangement. The positions of the points Pi, defined by their coordinates (xi, yi)
within the sound space, are calculated on the basis of the low-level parameters.
As in the example of Figure 3, a point Pi is an identifier representing a
sound document Di. The coordinates (xi, yi) are obtained by projecting the point
Pi, whose coordinates are the values of the low-level descriptors of a sound
sample, onto a space of dimension 2, 3, etc., depending on the type of
representation chosen. The projection from the space of descriptors to this
two-dimensional space is determined through principal component analysis or
PCA. PCA is described in particular in the Saporta document 1990, entitled
"Probabilités, Analyse des données et Statistique, Editions Technip"
[Probabilities, data analysis and statistics, published by Technip]. This data
analysis algorithm is aimed at determining a subsystem of axes, linearly related
to the original axes, which best "spreads" the documents; these axes tend to
merge the correlated original axes. In this way, the program can analyse the
sound documents and itself determine the principal dimensions; it is then the
program which chooses the number of dimensions of the sound space. According
to this technique, the document collection can be represented by a space with
more than two dimensions. It is thus possible to create a sound space with three
dimensions in which the user moves around. In this case, the installation must
be equipped with additional loudspeakers 5.11, and they must be arranged high
up and low down so as to give the user the impression that the sound is also
coming from high up or from low down. The low-level descriptors being assumed
to have a perceptible coherence and the projection being continuous, close
points correspond to perceptually close sounds. In a general manner, the
coordinates (xi, yi, zi) of a point Pi in a multidimensional space allow the user
to determine the type of associated sound document. Specifically, the positions
of the points Pi being calculated as a function of the values of low-level
parameters, if two points are graphically distant, the values of the low-level
parameters of the two sound documents identified by these two points are very
different and hence the type of the sound content is different, for example a
piece of classical music and a political speech. On the other hand, if two points
are close, then so also are the types of the associated sound documents from the
auditory standpoint.

The user selects a document within the sound space through the
auditory perception that the player generates. To do this, the player positions
the user at the centre of the sound space, at a point Pu with coordinates (xu,
yu), and selects the audio documents whose points Pi are nearest the position
(xu, yu) with a view to reproducing them. Through his auditory perception, the
user is aware of the sound space, and can orient himself towards a document
Di with the aid of the sound "emitted" by the point Pi associated with this
document, by actuating the key which gives the direction of the loudspeaker 5.11
reproducing this document with the loudest intensity.

The layout of Figure 7 illustrates the details of the audio interface 5.10.

The audio interface 5.10 is composed of two identical parts, one for
reproduction on the left earphone 5.11 and the other for the right earphone
5.11. The number of documents selected by the program must be small, five for
example. For each channel, the UC 5.3 associated with its program recorded in
the memory 5.12 controls five selectors S1, S2, S3, S4 and S5 whose functions
are to select a document from the set of audio documents of the memory 5.9
and to reproduce it. The five audio signals selected by the selectors Si are
transmitted respectively to five preamplifiers A1, A2, A3, A4 and A5 whose
gains are controlled by the UC 5.3. The gain of a preamplifier Ai reproducing an
audio document Di is inversely proportional to the distance, in the sound space,
separating the point (xu, yu) from the point Pi, with coordinates (xi, yi), associated
with this document. The gain also depends on the direction in which the point
(xi, yi) is situated with respect to a straight line starting from the point (xu, yu) in
the direction ahead of the user placed in the sound space. This straight line is
represented by an arrow in Figure 7. Thus, all the documents whose points Pi
are situated to the left of the user in the sound space are reproduced by the left
channel, and those situated to the right are reproduced by the right channel.
Moreover, the gain is all the larger as the angle between the segment formed of
the points Pi and Pu, and the straight line Du representing the direction ahead 
of the user. If the document is dead ahead of the user, the point Pi is therefore 
on this straight line Du so the user hears the audio content of this point equally 
well to the left and to the right. Finally, the five signals emitted by the
preamplifiers are mixed in an adder amplifier and amplified before being
dispatched to the earphones or loudspeakers 5.11.
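
By way of non-limiting illustration, the calculation of the gains for the left and right channels may be sketched as follows. This is a simplified sketch under stated assumptions: the gain is taken as the inverse of the distance, clamped by a hypothetical floor `d_min` so that it stays finite near the point; the side is decided by the sign of a cross product; the weighting by the angle with the straight line Du is not modelled. The names `channel_gains` and `mix` are hypothetical.

```python
import math

def channel_gains(user, heading, point, d_min=0.1):
    """Return (left_gain, right_gain) for one document point.

    heading is a (dx, dy) vector giving the direction ahead of the user.
    """
    vx, vy = point[0] - user[0], point[1] - user[1]
    distance = math.hypot(vx, vy)
    gain = 1.0 / max(distance, d_min)  # assumed law: inverse of distance, clamped
    # The sign of the 2D cross product gives the side of the heading line.
    side = heading[0] * vy - heading[1] * vx
    if side > 0:       # point to the left of the user
        return gain, 0.0
    if side < 0:       # point to the right of the user
        return 0.0, gain
    return gain, gain  # on the heading line: heard equally on both channels

def mix(samples, gains):
    """Weighted sum of one sample per document, as an adder amplifier would do."""
    return sum(s * g for s, g in zip(samples, gains))
```

For a user at (0, 0) facing along (0, 1), a point at (-2, 0) is routed to the left channel only, a point at (2, 0) to the right channel only, and a point at (0, 4) to both channels with the same gain.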

Thus, the user hears different audio contents in his left and right ears. As a
function of the sound signals, he can steer to the left or to the
right with the aid of the direction keys placed on the keypad 5.8, and orient
himself towards a point corresponding to a content Di which he wishes to listen 
to. When the point (xu, yu) is situated at the same location as the point (xi, yi) 
corresponding to the sound document Di, or is close to it by at most a 
determined distance, the document is regarded as selected and reproduced in
stereo on the two earphones 5.11; the other four documents are no longer
reproduced. If the user presses the direction keys and moves away from the 
document that he has just listened to, the program then reproduces the five 
documents closest to the point (xu, yu) with the weightings corresponding to 
distance and to direction. 

A variant consists in implementing a "Select" key on the keypad 5.8 of
the player 5.1. When the user presses this key, the program selects the sound
document closest to the point (xu, yu) where the user is virtually located and 
instructs reproduction thereof to the exclusion of any other document. The 
position (xu, yu) is stored in memory so that a second press of the "Select" key
causes a return to the previous state, in which the five sound documents closest to
the position of the point (xu, yu) are reproduced. 
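
By way of non-limiting illustration, the behaviour of the "Select" key may be sketched as a two-state toggle. The class `Navigator`, its attributes and the sample coordinates are hypothetical and serve only to show the stored position and the return to the previous state.

```python
import math

class Navigator:
    """Toggle between mixed playback of nearby documents and a single selection."""

    def __init__(self, points, k=5):
        self.points = points
        self.k = k
        self.user = (0.0, 0.0)
        self.selected = None    # index of the exclusively reproduced document, if any
        self.saved_user = None  # position stored when "Select" is pressed

    def _ranked(self):
        """Document indices ordered by increasing distance from the user."""
        xu, yu = self.user
        return sorted(
            range(len(self.points)),
            key=lambda i: math.hypot(self.points[i][0] - xu,
                                     self.points[i][1] - yu),
        )

    def press_select(self):
        if self.selected is None:
            # First press: store the position and keep only the closest document.
            self.saved_user = self.user
            self.selected = self._ranked()[0]
        else:
            # Second press: return to the stored position and to mixed playback.
            self.user = self.saved_user
            self.selected = None

    def playing(self):
        """Indices of the documents currently reproduced."""
        if self.selected is not None:
            return [self.selected]
        return self._ranked()[:self.k]

nav = Navigator([(0, 1), (5, 5), (1, 0), (-1, -1), (0, -2), (3, 0)])
before = nav.playing()
nav.press_select()
during = nav.playing()
nav.press_select()
after = nav.playing()
```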

We shall now describe improvements which will aid the user to navigate 
within the sound space. 

The five documents closest to the point associated with the user are
also close auditorily speaking, so that it is not easy for the user to determine an
axis of movement as a function of a particular type of music for example. A first 
improvement consists in determining groups of sound documents having 
auditory coherence, and in reproducing one or more so-called "representative" 
documents of each group. The determination of the groups may be performed
as was described previously, for example by comparing the values contained in
the descriptors of the sound documents, whether they be downloaded or 
calculated locally, and by grouping those whose values are close. 
In a manner that is particularly simple to calculate, the representative of
a group is the audio document whose point is situated closest to the centre of
the cloud of points formed by the audio documents of the group. Its identifier is the
audio content of this document. According to a variant, the representative is a succession of
documents or of extracts of the documents of the group; the identifier is then a
sound content constituted by the successive reproduction of extracts of each
document representing the group, each extract being reproduced for 10
seconds for example. The extracts are reproduced in a loop. According to
another variant, the program produces a synthetic sound calculated on the
basis of an average of the low-level parameters characteristic of the sound
documents of the group. 
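
By way of non-limiting illustration, the determination of the representative closest to the centre of a group may be sketched as follows. The centre is assumed here to be the arithmetic mean of the coordinates of the points of the group; the function name `representative` and the sample coordinates are hypothetical.

```python
import math

def representative(points, group):
    """Return the index in `group` whose point is closest to the group's centre."""
    # Assumed definition of the centre: mean of the coordinates of the group.
    cx = sum(points[i][0] for i in group) / len(group)
    cy = sum(points[i][1] for i in group) / len(group)
    return min(
        group,
        key=lambda i: math.hypot(points[i][0] - cx, points[i][1] - cy),
    )

# Hypothetical sound space containing two groups of three documents.
points = [(0, 0), (2, 0), (1, 1), (10, 10), (12, 10), (11, 11)]
rep_a = representative(points, [0, 1, 2])
rep_b = representative(points, [3, 4, 5])
```

In this hypothetical set, the documents at (1, 1) and (11, 11) lie nearest the centres of their respective groups and are therefore chosen as representatives.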

The assignment of a document to a determined group is performed by
adding a new column to the array of descriptors of Figure 2; this new column
contains the number identifying the group to which the document belongs. In
Figure 6, four groups have been identified by contours. When the user wishes
to navigate around groups, he presses a key of the player called "Group" and,
according to the example illustrated by this figure, the most representative
document of each of the four groups is reproduced (these four documents appear in
Figure 6 with a bold contour). This mode of navigation is deactivated by
pressing the "Group" key again. By firstly navigating from one group to another,
the user rapidly selects the type of audio content that he wants, then by 
deactivating the mode, he navigates from close document to close document 
within this group. By actuating the rotary element disposed on the keypad 5.8, 
the user remains on the same point Pu of the sound space and changes the
direction indicated by the arrow in Figure 6. Thus, while remaining on the spot,
the user can search for a direction of movement, halt his rotation when he
perceives the desired type of music ahead of him, and then move in this direction.
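
By way of non-limiting illustration, the rotation on the spot may be sketched as a change of the direction vector alone, the position Pu being left untouched. The function name `rotate`, the representation of the direction as a vector (dx, dy) and the use of an angle in degrees are assumptions made for the example.

```python
import math

def rotate(heading, degrees):
    """Rotate a (dx, dy) heading vector counter-clockwise by the given angle."""
    a = math.radians(degrees)
    dx, dy = heading
    return (dx * math.cos(a) - dy * math.sin(a),
            dx * math.sin(a) + dy * math.cos(a))

# A quarter turn to the left from a heading along the positive y axis.
new_heading = rotate((0.0, 1.0), 90)
```

After a quarter turn to the left, a heading along (0, 1) points approximately along (-1, 0), so that the documents previously heard on the left are now ahead of the user.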

A variant of the "Group" key consists in using the speed of
movement as a means of selecting the mode of navigation and the way of
calculating the groups. The user moves by pressing the four direction keys.
When he presses a key for a long time or several times in rapid succession, the program
considers that the user wants to increase the speed of movement. A single and
short press on a key makes it possible to return to a normal speed of 
movement. A variant consists in implementing a thumbwheel on the keypad 5.8
enabling the user to set the speed finely. In case of rapid movement, the
program creates a few groups of large size. Since these groups contain numerous
songs, the representatives that the user will hear will necessarily give only an
approximate idea of the content of the groups. If the user slows down his speed
of movement, the program will create smaller groups and hence permit the user 
finer selection. In this case, it is unnecessary to calculate groups for the whole
set of songs; they need be calculated only within the neighbourhood of the user. These groups being
defined more finely, the representatives are more faithful to the content of the
groups. When the speed is at its minimum, only the closest documents are
reproduced and thus the mode of navigation from close document to close
document is regained.
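
By way of non-limiting illustration, the dependence of the size of the groups on the speed of movement may be sketched as follows. The grouping by square cells of the sound space is a stand-in chosen for its simplicity and is not the grouping method of the description; the function name `groups_for_speed`, the parameter `base_cell` and the sample coordinates are hypothetical, and the speed is assumed to be strictly positive. The restriction to the neighbourhood of the user and the return to document-by-document navigation at minimum speed are not modelled.

```python
import math
from collections import defaultdict

def groups_for_speed(points, speed, base_cell=1.0):
    """Group point indices by grid cell; the side of a cell grows with the speed."""
    cell = base_cell * speed
    groups = defaultdict(list)
    for i, (x, y) in enumerate(points):
        groups[(math.floor(x / cell), math.floor(y / cell))].append(i)
    return list(groups.values())

# Hypothetical coordinates of six documents in the sound space.
points = [(0.5, 0.5), (1.5, 0.5), (0.5, 1.5), (3.5, 3.5), (3.6, 3.4), (7.0, 7.0)]
fast = groups_for_speed(points, speed=4)  # coarse cells: few groups of large size
slow = groups_for_speed(points, speed=1)  # fine cells: more groups of small size
```

With these hypothetical points, a speed of 4 yields two groups, one of which contains five documents, whereas a speed of 1 yields five groups of at most two documents each.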

Although the present invention has been described with reference to the
particular embodiments illustrated, it is in no way limited by these embodiments,
but is so only by the appended claims. It should be noted that changes or 
modifications may be made by the person skilled in the art. 